Extracting structured information effectively and accurately from long unstructured text with LangExtract and LLMs. This article explores Google's LangExtract framework paired with Gemma 3, Google's open-weight LLM, demonstrating how to parse an insurance policy to surface details such as exclusions.
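The core pattern the article describes is prompting a model for structured records and then grounding each extraction back to its exact character span in the source. A minimal stdlib-only sketch of that idea, with a hand-written stand-in for the model reply (LangExtract's real API and the policy text here are illustrative, not from the article):

```python
import json

# A toy policy excerpt standing in for the long document in the article.
POLICY = (
    "Section 4 - Exclusions. This policy does not cover flood damage. "
    "This policy does not cover losses caused by war."
)

# Pretend this JSON came back from Gemma 3 in response to a few-shot prompt.
MODEL_REPLY = json.dumps([
    {"class": "exclusion", "text": "flood damage"},
    {"class": "exclusion", "text": "losses caused by war"},
])

def ground(source: str, reply: str) -> list[dict]:
    """Attach character offsets so every extraction is traceable to the source."""
    records = []
    for item in json.loads(reply):
        start = source.find(item["text"])
        if start != -1:  # keep only extractions actually present in the text
            records.append({**item, "start": start, "end": start + len(item["text"])})
    return records

extractions = ground(POLICY, MODEL_REPLY)
```

Grounding is what makes the output auditable: any record whose text the model hallucinated simply fails the span lookup and is dropped.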
This article explores alternatives to NotebookLM, a Google assistant for synthesizing information from documents. It details NousWise, ElevenLabs, NoteGPT, Notion, Evernote, and Obsidian, outlining their key features, limitations, and considerations for choosing the right tool.
Leveraging MCP for automating your daily routine. This article explores the Model Context Protocol (MCP) and demonstrates how to build a toolkit for analysts using it, including creating a local MCP server with useful tools and integrating it with AI tools like Claude Desktop.
Local large language models can convert massive DataFrames into presentable Markdown reports; this article shows how.
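The formatting step itself needs no LLM: tabular records render to a Markdown table mechanically, and the local model's job is the surrounding narrative. A stdlib-only sketch of that rendering step (the column names and rows are made up for illustration):

```python
def to_markdown(rows: list[dict]) -> str:
    """Render a list of records as a GitHub-flavored Markdown table."""
    headers = list(rows[0])
    # Pad each column to the widest cell so the raw text stays readable.
    widths = {h: max(len(h), *(len(str(r[h])) for r in rows)) for h in headers}
    def line(cells):
        return "| " + " | ".join(str(c).ljust(widths[h]) for h, c in zip(headers, cells)) + " |"
    out = [line(headers), "| " + " | ".join("-" * widths[h] for h in headers) + " |"]
    out += [line([r[h] for h in headers]) for r in rows]
    return "\n".join(out)

rows = [
    {"region": "EMEA", "revenue": 1200},
    {"region": "APAC", "revenue": 950},
]
table = to_markdown(rows)
```

With pandas available, `df.to_markdown()` does the same job in one call; the point is that only summarization and commentary need the model.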
This article details how to accelerate deep learning and LLM inference using Apache Spark, focusing on distributed inference strategies. It covers basic deployment with `predict_batch_udf`, advanced deployment with inference servers like NVIDIA Triton and vLLM, and deployment on cloud platforms like Databricks and Dataproc. It also provides guidance on resource management and configuration for optimal performance.
NVIDIA DGX Spark is a desktop-friendly AI supercomputer powered by the NVIDIA GB10 Grace Blackwell Superchip, delivering 1000 AI TOPS of performance with 128GB of memory. It is designed for prototyping, fine-tuning, and inference of large AI models.
This blog post introduces the Semantic Telemetry project at Microsoft Research, which uses a data science approach to analyze how people interact with AI systems, specifically focusing on Copilot in Bing usage. It discusses the complexity of human-AI interactions and how they differ from traditional search.
- Topics: Copilot in Bing chats were analyzed for topic categorization. Technology (21%) was the most common topic, followed by Entertainment (12.8%), Health (11%), and others. Within technology, programming and scripting were prominent subtopics.
- Platform Differences: Mobile users tend to use Copilot for personal tasks, while desktop users engage in more professional activities.
This tutorial demonstrates how to perform semantic clustering of user messages using Large Language Models (LLMs) by prompting them to analyze publicly available Discord messages. It covers data extraction, sentiment scoring, KNN clustering, and visualization, emphasizing that this workflow is faster and less labor-intensive than traditional data science approaches.
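Once each message is reduced to numeric features (for instance an LLM-assigned sentiment score and a message length), the clustering step is ordinary unsupervised learning. A minimal stand-in using plain k-means on hypothetical feature vectors, not the tutorial's exact pipeline:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on tuples of floats; returns final centroids and clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean).
            i = min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Recompute centroids; keep the old one if a cluster emptied out.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical (sentiment, length) features: short negative vs. long positive messages.
points = [(-0.9, 5), (-0.8, 7), (-0.7, 6), (0.8, 40), (0.9, 42), (0.7, 38)]
centroids, clusters = kmeans(points, k=2)
```

The tutorial's shift is upstream of this step: the features come from prompting an LLM rather than from hand-built embeddings and labeling.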
A comprehensive guide to Large Language Models by Damien Benveniste, covering various aspects from transformer architectures to deploying LLMs.
- Language Models Before Transformers
- Attention Is All You Need: The Original Transformer Architecture
- A More Modern Approach To The Transformer Architecture
- Multi-modal Large Language Models
- Transformers Beyond Language Models
- Non-Transformer Language Models
- How LLMs Generate Text
- From Words To Tokens
- Training LLMs to Follow Instructions
- Scaling Model Training
- Fine-Tuning LLMs
- Deploying LLMs
- TabPFN is a novel foundation model designed for small- to medium-sized tabular datasets, with up to 10,000 samples and 500 features.
- It uses a transformer-based architecture and in-context learning (ICL) to outperform traditional gradient-boosted decision trees on these datasets.